First- and Second-Order Expectation Semirings with Applications to Minimum-Risk Training on Translation Forests
Abstract
Many statistical translation models can be regarded as weighted logical deduction. Under this paradigm, we use weights from the expectation semiring (Eisner, 2002) to compute first-order statistics (e.g., the expected hypothesis length or feature counts) over packed forests of translations (lattices or hypergraphs). We then introduce a novel second-order expectation semiring, which computes second-order statistics (e.g., the variance of the hypothesis length or the gradient of entropy). This second-order semiring is essential for many interesting training paradigms such as minimum risk, deterministic annealing, active learning, and semi-supervised learning, where gradient descent optimization requires computing the gradient of entropy or risk. We use these semirings in an open-source machine translation toolkit, Joshua, enabling minimum-risk training that yields an improvement of up to 1.0 BLEU point.
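To make the idea concrete, the following is a minimal sketch of a second-order expectation semiring applied to a toy lattice, computing the mean and variance of hypothesis length in one forward pass. It assumes the usual formulation in which each weight is a tuple (p, r, s, t) and an edge with probability p and length l is encoded as (p, p·l, p·l, p·l²). The class name `Exp2`, the helper functions, and the toy lattice are hypothetical illustrations, not Joshua's implementation, and the sketch handles only lattices whose node ids are already topologically ordered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exp2:
    """Element (p, r, s, t) of a second-order expectation semiring (sketch)."""
    p: float = 0.0  # total (unnormalized) probability mass
    r: float = 0.0  # sum of p(path) * r(path)
    s: float = 0.0  # sum of p(path) * s(path)
    t: float = 0.0  # sum of p(path) * r(path) * s(path)

    def __add__(self, other):
        # Semiring "plus": component-wise sum over alternative derivations.
        return Exp2(self.p + other.p, self.r + other.r,
                    self.s + other.s, self.t + other.t)

    def __mul__(self, other):
        # Semiring "times": combine two sub-derivations (product rule).
        return Exp2(
            self.p * other.p,
            self.p * other.r + other.p * self.r,
            self.p * other.s + other.p * self.s,
            self.p * other.t + other.p * self.t
            + self.r * other.s + other.r * self.s,
        )


ONE = Exp2(1.0, 0.0, 0.0, 0.0)  # multiplicative identity


def edge_weight(prob, length):
    """Encode an edge; here r and s are both the edge's length contribution."""
    return Exp2(prob, prob * length, prob * length, prob * length * length)


def length_stats(edges, num_nodes):
    """Return (mean, variance) of path length from node 0 to the last node.

    Assumes every edge goes from a lower node id to a higher one, so that
    processing edges by source id is a valid topological order.
    """
    alpha = [Exp2() for _ in range(num_nodes)]
    alpha[0] = ONE
    for src, dst, prob, length in sorted(edges, key=lambda e: e[0]):
        alpha[dst] = alpha[dst] + alpha[src] * edge_weight(prob, length)
    total = alpha[num_nodes - 1]
    mean = total.r / total.p
    second_moment = total.t / total.p
    return mean, second_moment - mean ** 2


# Hypothetical toy lattice: (source, target, probability, length in words).
TOY_EDGES = [
    (0, 1, 0.6, 1), (0, 1, 0.4, 2),
    (1, 2, 0.5, 1), (1, 2, 0.5, 3),
]

if __name__ == "__main__":
    mean, variance = length_stats(TOY_EDGES, 3)
    print(f"expected length = {mean:.2f}, variance = {variance:.2f}")
```

The first two components alone form the first-order semiring, which is enough for the expected length; the fourth component carries the cross term needed for the second moment, so the variance falls out of the same pass without enumerating the four paths.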
Similar resources
Discriminative Training and Variational Decoding in Machine Translation via Novel Algorithms for Weighted Hypergraphs
A hypergraph or “packed forest” is a compact data structure that uses structure-sharing to represent exponentially many trees in polynomial space. A probabilistic/weighted hypergraph also defines a probability (or other weight) for each tree, and can be used to represent the hypothesis space considered (for a given input) by a monolingual parser or a tree-based translation system (e.g., tree to...
External Factors and Iranian EFL Teachers’ Performance: Examining the Effectiveness of Self-regulation
Purpose: This paper has a two-fold objective. First, it examines the relationship between the external factors of compensation, support, empowerment, boundaries and expectations, and pre-service and in-service training, and Iranian EFL teachers’ performance. Second, it searches for the moderating effect of self-regulation on the relationship between teachers’ external assets and their performanc...
Effects of Flooding and Fire on Some Soil Properties of the Lakan Forest in Guilan Province
Flooding and fire are important phenomena that can periodically affect the forests of northern Iran. These phenomena can have undesirable effects on the properties and quality of soil. This study was conducted to investigate the effects of flooding and fire on some soil properties in Lakan forest, Guilan province. Soil sampling was carried out on three replicates from three depths 0-3...
Optimal Forest Road Density Based on Skidding and Road Construction Costs in Iranian Caspian Forests
Information on the productivity, costs and applications of the logging system is a key component in the evaluation of management plans for the rehabilitation and utilization of Caspian forests. Skidding and road construction are expensive forest operations. Determining the optimum forest road network density is one of the most important factors in sustainable forest management. Logging me...
Applications of higher order shear deformation theories on stress distribution in a five layer sandwich plate
In this paper, layerwise theory (LT) along with the first, second and third-order shear deformation theories (FSDT, SSDT and TSDT) are used to determine the stress distribution in a simply supported square sandwich plate subjected to a uniformly distributed load. Two functionally graded (FG) face sheets encapsulate an elastomeric core while two epoxy adhesive layers adhere the core to the face ...